220041006 Yihang Li
Using Google Trends, construct a weekly index to capture political relations between the U.S. and China from the U.S. perspective, plot the index in a graph, and discuss its time-series variation.
Timeline-U.S. Relations With China
Note: the index returned by Google Trends for this timeframe is already weekly.
!datapane login --server=https://datapane.com/ --token=d95b6dc19091f789ac6af9a7026844a49c7b68c5
import pandas as pd
from pytrends.request import TrendReq
import altair as alt
import plotly.express as px
Use pytrends to automate downloading reports from Google Trends.
# Take the keyword 'Trade War' as an example
pytrends = TrendReq(hl='en-US', tz=360)
keyword = 'Trade War'
pytrends.build_payload([keyword], cat=0, timeframe='today 5-y', geo='', gprop='')
top_queries = pytrends.related_queries()[keyword]['top']
fig = px.bar(top_queries,
x='query',
y='value')
fig.show()
r_o_t = pytrends.interest_over_time().reset_index()
r_o_t.rename(columns={keyword : 'search_volume'}, inplace=True)
# The date column shows that the data is already weekly (one row every 7 days)
r_o_t.tail(6)
| | date | search_volume | isPartial |
|---|---|---|---|
| 255 | 2020-09-20 | 14 | False |
| 256 | 2020-09-27 | 12 | False |
| 257 | 2020-10-04 | 15 | False |
| 258 | 2020-10-11 | 13 | False |
| 259 | 2020-10-18 | 11 | False |
| 260 | 2020-10-25 | 12 | True |
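The claim that the series is already weekly can be checked directly from the dates. A minimal sketch, using only the six dates shown in the table above, copied by hand rather than read from `r_o_t`; the helper name `day_gaps` is illustrative:

```python
from datetime import date

# Dates copied by hand from the table above (the last six rows of r_o_t).
dates = [
    date(2020, 9, 20),
    date(2020, 9, 27),
    date(2020, 10, 4),
    date(2020, 10, 11),
    date(2020, 10, 18),
    date(2020, 10, 25),
]


def day_gaps(ds):
    """Return the number of days between each pair of consecutive dates."""
    return [(later - earlier).days for earlier, later in zip(ds, ds[1:])]


gaps = day_gaps(dates)
is_weekly = all(gap == 7 for gap in gaps)
print(gaps, is_weekly)
```

If every gap is 7 days, no resampling is needed before plotting.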
r_o_t_plot = alt.Chart(r_o_t).encode(x='date', y='search_volume').mark_area(line=True).interactive().properties(width=700)
r_o_t_plot
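To discuss the time-series variation, it may help to smooth the weekly values and look at week-over-week changes. The sketch below is one possible approach, not part of the original analysis: it uses only the six rows printed above, drops the row flagged `isPartial` (which we take to mean the week is still incomplete), and the helper `rolling_mean` is an illustrative name.

```python
# (date, search_volume, isPartial) copied by hand from the r_o_t.tail(6) output.
rows = [
    ("2020-09-20", 14, False),
    ("2020-09-27", 12, False),
    ("2020-10-04", 15, False),
    ("2020-10-11", 13, False),
    ("2020-10-18", 11, False),
    ("2020-10-25", 12, True),
]


def rolling_mean(values, window):
    """Trailing moving average; returns one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Keep only complete weeks so a half-finished week does not distort the trend.
complete = [volume for _, volume, partial in rows if not partial]

smoothed = rolling_mean(complete, window=3)               # 3-week moving average
changes = [b - a for a, b in zip(complete, complete[1:])]  # week-over-week change

print(smoothed)
print(changes)
```

The same two summaries could be computed on the full five-year series to separate short-lived spikes from longer swings.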
Using Google Trends, construct a weekly index to predict the outcome of the 2020 U.S. presidential election, plot the index in a graph, and tell us who is more likely to win the final election.
kw_list = ['donald trump', 'joe biden']
kw_group = list(zip(*[iter(kw_list)]*1))
print(kw_group)
kw_grplist = [list(x) for x in kw_group]
print(kw_grplist)
# Each search term sits in its own single-item list, so build_payload()
# is called once per term in the loop below. Google Trends scales each
# request separately, so every series has its own peak of 100.
trendshow = TrendReq(hl='en-US', tz=360)
frames = {}  # one DataFrame per keyword, keyed by position
for i, kw in enumerate(kw_grplist):
    trendshow.build_payload(kw, timeframe='today 12-m', geo='US')
    frames[i] = trendshow.interest_over_time()
trendframe = pd.concat(frames, axis=1)
trendframe.columns = trendframe.columns.droplevel(0)
trendframe = trendframe.drop('isPartial', axis=1)
trendframe.tail()
[('donald trump',), ('joe biden',)]
[['donald trump'], ['joe biden']]
| date | donald trump | joe biden |
|---|---|---|
| 2020-09-27 | 100 | 100 |
| 2020-10-04 | 75 | 54 |
| 2020-10-11 | 41 | 53 |
| 2020-10-18 | 48 | 75 |
| 2020-10-25 | 43 | 65 |
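One caveat when reading this table: each keyword is sent in its own `build_payload` request, and Google Trends scales each request to its own highest point, which is why both columns reach 100. The sketch below illustrates the effect with made-up raw counts. The numbers and the helper `scale_to_100` are hypothetical, since Google does not publish raw search counts.

```python
def scale_to_100(series, peak=None):
    """Rescale so that `peak` (default: the series' own maximum) maps to 100."""
    peak = max(series) if peak is None else peak
    return [round(100 * value / peak) for value in series]


# Hypothetical raw weekly search counts, for illustration only.
raw_a = [800, 600, 400]
raw_b = [200, 120, 150]

# Separate requests: each series is scaled to its own peak.
separate_a = scale_to_100(raw_a)
separate_b = scale_to_100(raw_b)

# One joint request: both series share the overall peak.
joint_peak = max(raw_a + raw_b)
joint_a = scale_to_100(raw_a, joint_peak)
joint_b = scale_to_100(raw_b, joint_peak)

print(separate_a, separate_b)
print(joint_a, joint_b)
```

Scaled separately, the two series look similar in level even though one is four times larger. With pytrends, passing both keywords in a single list to `build_payload` should return the jointly scaled series, which is the version that supports a level comparison between the two candidates.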
import plotly
import plotly.graph_objects as go
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
import plotly.offline as pyo
init_notebook_mode(connected=True)
trace = [go.Scatter(
x = trendframe.index,
y = trendframe[col], name=col) for col in trendframe.columns]
data = trace
layout = go.Layout(title='Post', showlegend=True)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
trace = [go.Bar(
x = trendframe.index,
y = trendframe[col], name=col) for col in trendframe.columns[0:2]]
data = trace
layout = go.Layout(title='Post', showlegend=True)
fig = go.Figure(data=data, layout=layout)
iplot(fig)
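Relative to each candidate's own peak, Joe Biden's search index has been higher than Donald Trump's in each of the three most recent weeks (53 vs. 41, 75 vs. 48, 65 vs. 43), so on this index Joe Biden is more likely to win the final election.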
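The comparison behind this conclusion can be sketched from the five weeks printed by `trendframe.tail()`. This is a rough illustration using values copied by hand from that output. It is not a forecasting model, and the variable names are illustrative.

```python
# (week, donald trump, joe biden) copied by hand from the trendframe.tail() output.
weeks = [
    ("2020-09-27", 100, 100),
    ("2020-10-04", 75, 54),
    ("2020-10-11", 41, 53),
    ("2020-10-18", 48, 75),
    ("2020-10-25", 43, 65),
]

# Weeks in which one candidate's index is strictly higher than the other's.
biden_higher = [week for week, trump, biden in weeks if biden > trump]
trump_higher = [week for week, trump, biden in weeks if trump > biden]

# Average index over the five weeks shown.
mean_trump = sum(trump for _, trump, _ in weeks) / len(weeks)
mean_biden = sum(biden for _, _, biden in weeks) / len(weeks)

print(biden_higher, trump_higher)
print(mean_trump, mean_biden)
```

Because each series was downloaded in a separate request, these values compare each candidate to his own peak rather than to the other candidate.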
Using Baidu Index, construct an index to capture investor sentiment in the Chinese market, plot the index in a graph, and discuss its time-series variation.
Reference:
[1] How to Use Baidu Index for Data Analysis (如何利用百度指数进行数据分析?)
Here we choose the keyword '黄金' (gold) as our index of investor sentiment in the Chinese market.
The graph below shows its search trend among Baidu users in China, according to the Baidu Search Index.
Note:
1. The search trend for '黄金' varies with a clear periodic pattern.
2. On 2020-08-12 there is a peak in search volume that is much higher than at any other point. Looking into why this peak appeared, we found that the price of gold dropped sharply at that time.
3. Comparing with the keywords '美股' (U.S. stocks) and 'a股' (A-shares), shown below, we see that the period is very similar. We guess that this is related to trading times.
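One way to test the guess in note 3 is to average the daily index by day of the week: if the cycle follows trading days, weekdays should sit clearly above weekends. The sketch below shows the idea on hypothetical daily values. They are not Baidu Index readings, and the helper `mean_by_weekday` is an illustrative name.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily index values for two weeks starting Monday 2020-08-03.
# These are NOT Baidu Index readings; they only illustrate the method.
start = date(2020, 8, 3)
values = [120, 125, 118, 122, 119, 80, 78,
          124, 121, 126, 120, 117, 82, 79]


def mean_by_weekday(first_day, series):
    """Average a daily series by day of week (0 = Monday ... 6 = Sunday)."""
    buckets = defaultdict(list)
    for offset, value in enumerate(series):
        day = first_day + timedelta(days=offset)
        buckets[day.weekday()].append(value)
    return {wd: sum(vs) / len(vs) for wd, vs in sorted(buckets.items())}


by_weekday = mean_by_weekday(start, values)

# Compare the Monday-Friday average with the Saturday-Sunday average.
trading_days = [by_weekday[wd] for wd in range(5)]
weekend = [by_weekday[wd] for wd in (5, 6)]
weekday_avg = sum(trading_days) / len(trading_days)
weekend_avg = sum(weekend) / len(weekend)

print(by_weekday)
print(weekday_avg, weekend_avg)
```

Applying the same grouping to the exported series for '黄金', '美股', and 'a股' would show whether all three share the weekday pattern.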